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Abstract 

We study the formation of protein-protein encounter complexes with a Langevin equation ap- 
proach that considers direct, steric and thermal forces. As three model systems with distinctly 
different properties we consider the pairs barnase:barstar, cytochrome cxytochrome c peroxidase 
and p53:MDM2. In each case, proteins are modeled either as spherical particles, as dipolar spheres 
or as collection of several small beads with one dipole. Spherical reaction patches are placed on 
the model proteins according to the known experimental structures of the protein complexes. In 
the computer simulations, concentration is varied by changing box size. Encounter is defined as 
overlap of the reaction patches and the corresponding first passage times are recorded together 
with the number of unsuccessful contacts before encounter. We find that encounter frequency 
scales linearly with protein concentration, thus proving that our microscopic model results in a 
well-defined macroscopic encounter rate. The number of unsuccessful contacts before encounter 
decreases with increasing encounter rate and ranges from 20-9000. For all three models, encounter 
rates are obtained within one order of magnitude of the experimentally measured association rates. 
Electrostatic steering enhances association up to 50-fold. If diffusional encounter is dominant 
(p53:MDM2) or similarly important as electrostatic steering (barnase:barstar), then encounter 
rate decreases with decreasing patch radius. More detailed modeling of protein shapes decreases 
encounter rates by 5-95 percent. Our study shows how generic principles of protein-protein associ- 
ation are modulated by molecular features of the systems under consideration. Moreover it allows 
us to assess different coarse-graining strategies for the future modelling of the dynamics of large 
protein complexes. 
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I. INTRODUCTION 



Protein-protein interactions play key roles in many cellular processes such as signal trans- 
duction, bioenergetics, and the immune response [T]. Moreover, many proteins function in 
the context of protein complexes of variable sizes and lifetimes. Examples of such complexes 
are ribosomes, polymerases, spliceosomes, nuclear pore complexes, cytoskeletal structures 
like the mitotic spindle or actin stress fibers, adhesion contacts, the anaphase-promoting 
complex, and the endocytotic complex pj. For yeast, 800 different core complexes have 
been identified, suggesting the existence of 3000 core complexes for humans [3] • In addition 
it has been shown for yeast that most protein complexes are assembled just-in-time during 
the course of the cell cycle [1]. In fact many protein complexes in the cell are highly dynamic, 
with fast turnover of many components. One can argue that their dynamics, although exper- 
imentally very hard to access, is biologically more relevant than their equilibrium properties. 
Therefore a systematic understanding of the dynamics of protein complexes in cells is one 
of the grand challenges in quantitative biology. 

The elementary unit of all of these cellular processes is the bimolecular protein-protein 
interaction. The strength and specificity of protein-protein association are determined by 
the integrated effect of different interactions, including shape complementarity, van der 
Waals interactions, hydrogen bonding, electrostatic interactions and hydrophobic effects. For 
example, the importance of electrostatic interactions has been demonstrated by experimental 
measurement at different ionic strengths [5] . To a first approximation, bimolecular reactions 
are characterized by on- and off-rates. The equilibrium association constant (or affinity) 
then follows as the ratio of the two. From a conceptual point of view, on- and off-rates are 
very different. On-rates are commonly believed to be controlled by the diffusion properties 
as well as by long-ranged electrostatic interactions, whereas off-rates are rather controlled 
by short-ranged interactions like hydrogen bonding and van der Waals forces. 

The main features of the dynamics of protein association can be conceptualized within 
the framework of the encounter complex [6] . To this end, the association is divided into two 
parts. First, mutual entanglement - the encounter complex - is achieved by the proteins 
due to a transport process including mainly diffusion but also electrostatic steering on small 
length scales fl]. If diffusion-controlled, classical continuum approaches can be used to 
describe this part of the process jtSj. To form the final complex, the system then has to 
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overcome a free energy barrier due to local effects like dehydration of the binding interface 
[0]. Due to the various molecular contributions involved in this step, here the two binding 
partners essentially have to be modelled at atomic detail. Moreover the solvent may need to 
be treated explicitly and one might has to account for conformational changes pUj. Thus, 
it appears reasonable to use the encounter complex as a crossover point from a detailed, 
atomistic treatment to a coarse-grained model and vice versa. 

Thermal fluctuations are an essential element of protein-protein encounter because they 
allow the two partners to exhaustively search space for access to the binding interface. From 
the viewpoint of stochastic dynamics, protein-protein association is a first passage time prob- 
lem which can be addressed mathematically in the framework of Langevin equations. The 
application of Langevin equations to association phenomena goes back to early work in the 
colloidal sciences [HI [12]. In these early approaches, the reactants were considered to have 
small spatial extensions and to be uniformly reactive. For large biomolecules like proteins, 
the situation is fundamentally different. Typically, proteins and other biomolecules have 
specific sites on their surface, where a particular binding reaction can take place. There- 
fore, such binding events are subject to intrinsic geometric constraints for every particular 
protein-protein pair or larger assembly. The standard model for ligand-receptor interaction 
was introduced by Berg and Purcell in the context of chemoreception based on the idea of 
using reactive patches to model anisotropic reactivity [13]. Due to anisotropic reactivity, 
also rotational diffusion becomes important. Shoup et al. showed that the effect of rotational 
diffusion can strongly increase the association rate between a receptor with a flat reactive 
patch and uniformly reactive ligands [H] . Later, analytic expressions for the association rate 
between two spherical particles with both carrying a flat axially symmetric |15J and asym- 
metric [m [T7] reactive patch were derived. Similar concepts were also applied by Schulten 
and colleagues [18] . 

For many important aspects, analytical approaches are not possible and computer simu- 
lations are required. This approach has been used early for protein-protein association [19] . 
The importance of electrostatic interactions for long ranged attraction was also emphasized 
by Brownian dynamics simulations of protein-protein encounter [191 EOl EH [22] . If atomic 
structure is taken into account, then successful encounters are deflned by simultaneous ful- 
flUment of two to three distance conditions between opposing residues on the two surfaces 
[25] . Brownian dynamics have also been used for the simulation of high density solutions. 
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e.g. by Bicout and Field who studied a cellular "soup" containing ribosomes, proteins and 
tRNA molecules [21], or by Elcock and coworkers who simulated a crowded cytosol for 10/xs 



In order to develop a quantitative framework for modelling the dynamics of protein 
complexes, it is essential to understand the relative importance of generic principles and 
molecularly determined features of specific systems of interest. Only a good understanding 
of this issues will allow us in the future to develop reasonable coarse-graining strategies to 
address also large complexes of biological relevance. In this study, we therefore address how 
general principles guiding the diffusional association of biomolecular pairs are modulated by 
their particular physicochemical properties. To this end we have selected three molecular 
systems of interest with different steric and electrostatic properties. One of the best studied 
bimolecular complexes is the extracellular ribonuclease Barnase and its intracellular inhibitor 
Barstar. Both proteins carry a net charge of 2e and — 6e, respectively, which leads to a 
considerable electrostatic steering [2S1 123 I2H1 [2HI EO]- Considering the structure of the two 
proteins, Barnase has a bean-like form, matching well on a large reactive area with the 
nearly spherical Barstar. A classic example of electrostatically-driven protein association 
is the iso-l-cytochrome c - cytochrome c peroxidase (Cytc:CCP) complex, charged with 6e 
and — 13e, respectively, and exhibiting dipoles aligned well with the reactive areas fl9\ |3T] . 
Finally, we selected the medically important complex of a peptide fragment of p53 and its 
inhibitor MDM2, which is used for anticancer drug design. In this system, electrostatic 
attraction plays a minor role. On the other hand, the steric match of the two surfaces is of 
particular importance here. It is a perfect example of a key-lock binding interface, where 
p53 is buried deep into a cleft on the MDM2 surface. 

In this paper we systematically explore the effect of various coarse-graining procedures 
on the rate for protein-protein encounter for the three selected model systems. We revisit 
early approaches based on Langevin equations and combine them with current knowledge on 
molecular structure. The paper is organized as follows: In Sect. |TT| we present our different 
stochastic models and describe the methods we use to parameterize the three considered 



bimolecular model systems. Sect. Ill contains the main findings of our study, which are 



discussed and summarized in Sect. IIVI 
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II. MODELS AND METHODS 



A. Modelling proteins at different levels of detail 

One aim of this work is to determine how important specific details of the model proteins 
are with respect to the association properties. Therefore, we considered three different 
levels of detail as depicted in Fig. [T] for the three chosen systems. In the most generic 
approach (A^l), we only considered the steric interaction between spherical particles covered 
with reaction patches. As a first refinement {Ai2), an effective Coulombic interaction was 
introduced using the dipolar sphere model (DSM). Finally, since our Langevin equation 
approach is particularly suited to capture anisotropic transport, we consider a more refined 
version for protein sterics {Ai3). In this approach the excluded volume of each protein was 
modeled by 8-25 smaller beads. Ai3 uses the DSM as well. In Fig. [T| we also show the 
full structures in the bottom row as surface representations, including the locations of the 
binding interfaces. In the following, the general properties of the simulation model and the 
different techniques used in this work will be explained. 



B. Diffusion properties 

The diffusion of the protein model particles is described by an anisotropic 6x6 diffusion 
matrix in all versions of our model. In Ref. [32], de la Torre and coworkers present a method 
to calculate this diffusion matrix from the pdb structure of a protein. This method has been 
implemented in a software called HYDROPRO which is provided online by the same authors 



(http://leonardo.fcu.iim.es/macromol). The basic concept is to put spheres of a certain 
size at the position of any non-H atom. The volume of these spheres effectively models a 
fixed hydration shell. This construct is then filled up with smaller, densely packed, but non- 
overlapping spheres. Since the hydrodynamic properties of a rigid body are determined by its 
outer boundary only, a shell of these small spheres is generated by deleting all spheres which 
have a maximum number of possible neighbors. To this shell a sophisticated technique is 
applied, which has been developed by de la Torre and colleagues over the years, to calculate 
the diffusion matrix of such a cluster of non-overlapping spheres (see references in [22] )■ 
Several system properties are implicitely contained in the mobility matrix, such as ambient 
temperature Ta = 293K, as well as the density and dynamic viscosity of the solvent, where 



we chose the respective parameters of water, p = Ig/cm^ and /i = 10~^Pa s. For simphcity, 
hydrodynamic interactions were not introduced in our models, because the corresponding 
effect on the association rates is expected to be well below 10% [55] . 

C. Langevin equation and simulation method 

For the integration of the Langevin equation, which describes the stochastic motion of the 
particles, we follow an approach which has been recently developed to model cell adhesion 
via reactive receptor patches pH [35] . Let be a six-dimensional vector describing position 
and orientation of a particle at time t. Since the noise due to Brownian motion is additive 
(which means that it does not depend on due to a constant mobility matrix M), the 
Langevin equation is given by: 

dtXt = MF + gt. (1) 

Here, F is a six-dimensional vector containing the force and torque acting on the particle, 
and gt denotes Gaussian white noise: 

iSt) = , {gtgf) = 2kBTaM6{t - t') . (2) 

As explained in App. C of Ref. the Euler algorithm can be used to solve a discretized 
version of this equation: 

X(t + At) = X(t) + MF(t)At + g(At) + C»(At2) . (3) 

For proteins, the typical orders of magnitude are D = lO^^cm^s^^ and R = Inm. Therefore, 
a reasonable choice for the time step is At = Ips, as this leads to a mean step length of 
VDAt = O.Olnm. 

The mobility matrix of a particle is defined in a particle-fixed coordinate system. Thus, 
the whole step has to be calculated in terms of particle-fixed coordinates and then trans- 
formed to the laboratory coordinate space. In particular, this transformation implies a 
rotation R regarding the orientation of the particle. Special attention has to be payed to 
the force F, which is typically calculated in the global frame of reference and hence has to 
be transformed to particle space before Eq. [3] can be evaluated. This back-transformation 
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is achieved by applying to F. Since rotation matrices R simply consist of a list of or- 
thonormal vectors, their inverse is equal to the transposed matrix R^^ = R^. Thus Eq. [s] 
can be rewritten as 

X(t + At) = X(t) + R [M (R^F(t)) At + g(At)] + 0{At^) . (4) 

As F and g are six-dimensional and contain information about torque and rotation, Eq. 
|4]is only formally correct, as R acts on both the translational and rotational parts of the 
respective vectors separately. 

In each step of the simulation, a displacement vector AX(t) is drawn for each particle as 
described above. If this global displacement leads to any violation of the hardcore repulsion, 
all suggested displacements are rejected and new A'X.{t) are calculated. This procedure 
continues, until an update of all positions and orientations is found which does not lead to 
any overlap. In this way, the constraint according to the excluded volume effect is included 
in the stochastic motion. The spherical reactive patches are not taken into account for the 
steric interactions, i.e. they may not only overlap pairwise but also with the model particles. 
One would expect that our procedure leads to errors of order At if two particles are in close 
proximity of order V DAt. However, it has been shown for a different system PS] that in 
practise the deviation from the expected behavior is very small and thus the approach is 
reasonable. 

D. Anisotropic versus isotropic diffusion 

As mentioned before, the 6x6 mobility matrix M represents anisotropic diffusion. For 
large times, anisotropic diffusion crosses over into isotropic diffusion because the information 
about the initial orientation gets lost after a certain relaxation time due to the rotational 
diffusion [37]. In general, translational and rotational diffusion are coupled so that large 
time steps cannot be used. However, for the particular systems studied here, we found that 
the diffusive coupling is a very small effect. In particular, the major entries in the diffusion 
matrix of the proteins used here according to HYDROPRO multiplied with different powers 
of the Stokes radius R ~ 10"^cm to make the dimensions comparable are Du/R^ ~ 10®s~^, 
D rr ~ lO^s ^, Dtr/R ~ lO'^s ^. Therefore, the effect of diffusive coupling is 10 ^ and 10 
smaller than rotational and translational diffusion, respectively. Finally, the typical time 
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scale at which the cross-over is expected can be calculated to be IjiQD^) ^ 10ns. Time 
steps of this magnitude were rarely used in the simulations (see below), so that for most of 
the steps, the anisotropicity is well preserved. Therefore we can safely neglect changes in 
the anisotropicity of the mobility matrix. 

E. System size and time step adaption 

The simulations were performed in a cubic box with periodic boundary conditions. 
Schreiber and Fersht used concentrations between 0.125/iM and 0.5/iM in their experi- 
mental studies of the association rate of the Barnase:Barstar complex [5]. The average 
volume containing one particle at a concentration c is 1/cNa with the Avogadro number 
Na = 6 ■ lO^'^mol^^. Hence, the edge length of a cubic boundary box representing concentra- 
tion c can be calculated from L = \/V = 1/^cNa- E.g. c = 0.125/iM leads to L ^ 2370A for 
one pair of particles, which is two orders of magnitude larger than the size of the proteins. 
Due to this low density, the first passage times (FPT) for encounter can be expected to be 
much longer than the chosen time step. For computational efficiency, we therefore used a 
variable time step in our simulations. Van Zon and ten Wolde suggested a method to avoid 
unwanted collisions when they introduced their Green's function reaction dynamics (GFRD) 
|38j . In contrast to our work, however, this method is based on isotropic diffusion. Gener- 
alizing the GFRD to anisotropic diffusion is out of the scope of our work and we therefore 
used the following scheme. We first note that in GFRD each time step is chosen such that it 
includes the next reaction. In our case, we also want to investigate the stochastic dynamics 
before the next encounter event takes place. Thus a large time step is not chosen to include 
the next encounter, but to bring the system to such a configuration that encounter becomes 
more likely. This step can be well represented by isotropic diffusion with an overall diffusion 
constant D = {Vn + V22 + ^^33)/3 following from the anisotropic diffusion matrix. 

For an isotropic random walk, the displacement probability is given by a Gaussian 
distribution with spherical symmetry. Thus, large spatial steps are exponentially sup- 
pressed, which makes a step of size Ar^^^ > H^/QDAt an if-sigma event. By setting 
^^max ~ min{r^j''} the smallest effective particle distance in the system, where effective 
means the distance of the surfaces r^^ = |rj — r^l — i?j — Rj with Ri determining the maximal 
steric interaction radius of particle i, one can estimate a reasonable time step for which a 
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collision is highly improbable. Van Zon and ten Wolde found that the choice H = 3 provides 
good results, combined with the fact that wrongly sampling a collision event would need a 
certain direction of the displacement in addition to the length. As the particles reach close 
proximity min{r^j^} 0, the estimated At vanishes and thus the simulation would be slowed 
down infinitely. Therefore, there has to be some lower boundary for the time step Attain, 
which is generally chosen to be Atmin = Ips as explained earlier. Thus, the adapted time 
step is given by: 




In practice, most time steps are in the ps-range, with very few time steps coming up to the 
ns-range. 

F. Electrostatic interactions 

Electrostatic interactions are known to play an important role in protein association. To 
study the effect of electrostatics in our generic model, the models A42 and M.3 utihzed 
the dipolar sphere model (DSM), following Refs. [39l HQ]. The DSM effectively models a 
monopole and dipole interaction by summing over the interactions of three charges, one posi- 
tioned in the center of each particle, and two close to its surface in opposite positions. Taking 
into account the Debye screening function due to the presence of counter ions in solution, 
the electrostatic interaction energy between two charges Qi/j at positions Vi/j respectively 
with distance rij = \rij\ = |rj — r^l is: 

Here, k = is the inverse Debye screening length, which typically has a value of ~ 
Inm under physiological conditions. We assume a value of Sr = 78 for the relative static 
permittivity of the medium, which reflects the properties of water at ambient temperature. 
Bij is a correction to the screening of charges which are placed in an object like a protein 
which has no free charges inside. Taking bi/j as the closest distance of qi/j from the surface 
of the surrounding protein, it is approximately given by Bij = bi + bj. This potential leads 
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to a force of charge qj on qi. 



F- ■ 





1 



(1 + f^BM n 



(7) 



As our simulation uses periodic boundary conditions, actually an infinite number of copies 
exists for every charge. However, due to the very fast decay of the screened electrostatic 
interaction, only the minimum image distance of two charges is considered in the force 
calculation. Two model particles m and n feel the sum of the Coulomb forces between all 
pairs of the three complementary charges mimicking the monopolar and dipolar interactions. 
Thus the full force between particle m and n is F^n = Yl^=i X]j=i ^ij^ where i/ j run over 
the charges of m/n respectively. 

As explained earlier, the action of the force on a particle in the Langevin equation is 
weighted with the mobility matrix M. The HYDROPRO software directly gives the diffusion 
matrix V = fc^TaM. This means that in our case the force action should be rewritten as 
MFAt=VF At /ksTa. Considering a time step of At = lOps and a typical diffusion constant 
D = 10~^°mVs, we have D \F{lnm)\ At /keTa ~ lO^^^m and D \F{Anm)\ At /kBT^ ~ 10~^^m 
for typical distances rjj = Inm and r^j = 4nm, respectively. In contrast, the typical step 
length due to the Brownian motion is V DAt ~ 10~^^m. This shows that the magnitude of 
electrostatic interactions at distances of Inm is much smaller than thermal energy. Therefore 
the effect of force is also not considered in our adaptive scheme for the time steps. However, it 
can be expected that the systematic drift, albeit small, will still lead to an altered encounter 
behavior. 

G. Parameterization 

GabdouUine and Wade [23] used several criteria to define contact areas of bimolecular 
protein complexes. In our studies, we define the contact area to consist of those atoms in the 
two interacting proteins that are at SA or less distance from an atom of the complementary 
protein. The center of mass of these atoms is considered as the center of the reactive area. 
For Ail and Ai2, the reactive patch is centered at the surface of the sphere modeling the 
excluded volume such that it has the same relative direction from the center of mass as 
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obtained by the method described. In the case of M.3, the center of the patch is set to the 
center of the reactive area. 

The contact area has a diameter of approximately lOA to 20A for the three systems 
studied here. Following earlier Brownian dynamics simulations with atomistic details [7] we 
have performed an in-depth analysis of the free energy landscape and the encounter state 
of the protein complexes considered in this work (unpublished results). This showed that 
the encounter complex is typically located at relative separations of the two protein surfaces 
of about lOA compared to their positions in the final complex. As the spherical reactive 
patches used in this study simultaneously determine both the size of the contact area on the 
surface and the distance above their surface at which an encounter will be possible, values in 
the range of SA to lOA seem to be reasonable. Note that as long as physical considerations 
do not dictate non-spherical reaction patches, the spherical choice is highly favorable for 
computational efficiency. 

As already stated in the beginning, two types of excluded volume structures are taken 
into account. In the first case, used in All and A42, the proteins are assumed to have an 
approximately spherical form. The radius for the model spheres determining the hard core 
interaction follows as the radius of gyration of the protein, which is also calculated by the 
HYDROPRO software. The underlying data in the more detailed approach Ai3 is obtained 
using the AtoB bead modeling software [HI |12] . In this way, the three-dimensional structure 
of the proteins is modeled with a comparably small number of 8 to 25 spheres of different 
sizes. 

The monopole charge is the sum of all elementary charges in a protein and is placed at 
the center of the respective model particle. The dipole moment p is obtained by summing 
over the product of all atomic charges due to the xyz force field and their relative position 
to the center of mass. In the model, it is represented by two opposing charges which are 
positioned along the direction of p and at a distance = Rgy^ — 4A. The magnitude p' is 
chosen such that |p| = 2pVp. We found that the particular choice of Vp does not have a 
noticeable infiuence on the results. The resulting parameterization is given in Tab. [T]for the 
proteins considered here. 
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III. RESULTS 



A. Encounter frequency and encounter rate 

Langevin dynamics simulations were performed for cubic boxes containing two model 
proteins. Simulations were conducted until the encounter condition was met for the first 
time (typically after milliseconds). Because in our Langevin simulations we measure the 
distribution of first passage times (FPT) to encounter, from which we can deduce the mean 
first passage time (MFPT) (T), the encounter frequency is defined as A; = 1/ (T). This choice 
is motivated by the fact that for a Poisson-like process, the distribution of first passage times 
is given by /(T) = ke~^'^ and therefore the encounter frequency k indeed satisfies k = 1/ (T). 
As the preparation of a comparable experiment would never allow knowing the particular 
initial positions and orientations of the unbound proteins, it makes sense to average over 
the possible initial configurations in the computer simulations. Therefore, we started a large 
number of runs (typically 10^ to 10^) with random initial positions and orientations for all 
involved model particles, under the constraint that the initial pairwise distance is at least 
large enough to prevent an immediate encounter. The "first passage" is defined as the first 
overlap of two complementary reactive patches. Interestingly, due to this averaging the 
first passage process becomes Poisson-like, see Fig. [2j The data show a clear exponential 
behavior. This means that it is justified to use the notion of an "encounter frequency", as 
the FPT distribution is indeed represented by a single stochastic rate. The finite probability 
at small FPT is due to the possibility that the two model particles are already in close 
proximity when the simulation is started. The large errors in the histogram at T are 
caused by the fact that exponentially sized histogram bins were used to sample the behavior 
for small T. Therefore, events hitting a particular bin are rare because of the small width 
of the bins at T — 0, which then leads to bad statistics in this domain. 

As the encounter process is purely diffusion limited in A^l, one would expect the en- 
counter frequency to scale linearly with concentration. Fig. |3] demonstrates for the Bar- 
nase:Barstar system that this is indeed the case. Hence, it is reasonable to always scale 
the encounter frequencies with the inverse concentration, as will be done for the rest of this 
work. We will denote these rescaled quantities as encounter rate, i.e. the encounter rates 
have the dimension M~^s^^. In summary, we have demonstrated here that our microscopic 
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model leads to a well-defined macroscopic encounter rate. 



B. Finite size effects 

In most of the simulations, only one instance of the final complex was considered, i.e. one 
model particle of each kind. Using such small systems could lead to undesired finite size 
effects. We therefore considered the effect of having many particles in the simulation box. 
Fig. |4] shows the simulation results for the encounter frequency k for an increasing number 
of Barnase:Barstar pairs, while keeping the size of the boundary box constant. In order to 
understand the expected effects, consider a system with molecules of Barnase (A and A') 
and two molecules of Barstar (B and B') randomly distributed over the boundary box. The 
relative alignment of any pair of As and Bs is therefore random again. For a particular pair 
the distribution of times to first encounter will thus look very similar to the case with a 
single pair in the box, which is a simple exponential decay with respect to the encounter 
frequency ki. /i(T) = kiexp[—kiT]. The probability that, e.g., the particular pair A — B 
reaches encounter at a certain time t before the three other possible pairs (A' — B, A — B', 
A' — B'), is therefore: 

oo oo oo oo ^ 

p(t) = j j dts j dta j dt4 5{h - H '^i^"'''*' = k^e-^^^' . (8) 

ti t-i ti 

Thus, the probability that any of the four possible particle pairs reaches encounter before the 
respective three other pairs do, is 4 x p{t) as just calculated, i.e. f2(T) has again a Poisson 
form like fi{T) and k2 = 4/ci. In general, for higher numbers of particle pairs A^, we expect 
to again find an exponential distribution of the time to first encounter with the encounter 
frequency k^ = N'^ki. This quadratic behavior is nicely confirmed by the data shown in 
Fig. [4], which suggests that even for small systems with only two particles, no severe finite 
size effects have to be expected. In particular, this rules out that larger numbers of particles 
lead to noticeable three-body interactions or hindering of the encounter process. 

C. Alignment during encounter 

One feature of special interest which we can address with our Langevin equation approach 
is the pathway through which the encounter is formed. We dissected the encounter process 
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into several parts as visualized in Fig. |5| At the start of each run, the systems were prepared 
in the unaligned state Al, as described earlier. A state of close approach which however 
does not allow for binding is called A2. The two model proteins will switch between states 
Al and A2 a number of times A^, until they finally reach the encounter complex ^3 due 
to a favorable combination of translational and rotational diffusion. In the following, each 
occurance of A2 will be termed a contact. Thus A^ counts the number of unsuccessful 
contacts before the encounter is finally formed. A separate set of simulations was performed 
to measure the distribution of A^. Furthermore we analyzed the distribution of return times 
Toff. This is the time it takes for two model proteins to get into contact again {Al^A2) 
after having lost translational alignment {A2^Al), i.e. after they were in close proximity. 
Finally, we determined the distribution of resting times Ton in translational alignment A2 
before the two model particles separated again. 

As an example. Fig. [6] shows the distribution of A^ for the Barnase:Barstar model system 
at c = 0.5/iM in the framework of Ail. Surprisingly the distribution of the number of 
contacts has again a Poisson form. Note that the number of unsuccessful trials in state 
A2 can be rather large (up to 10''). We also found that the distribution of A^ is roughly 
independent of concentration. This is reasonable, as after the two proteins were in contact 
once, the further encounter process is guided by returns to state A2 and thus should be 
more or less independent of system size. 

Fig. [T] shows that the return time Tofj (plotted with the plus-symbol) is not exponentially 
distributed. Instead, it follows a power law p(Toff) ~ T^^^"^ and undergoes an exponential 
cutoff due to the finite size of the boundary box at large Tog. Therefore, there is a high 
probability for very small return times, i.e. situations, where the two model proteins do not 
really separate, but immediately after loosing translational alignment {A2—>-Al) get closer 
again {Al^A2). The power law behavior of the return time is consistent with the problem 
of a random walk to an absorber in three dimensions [151 HI]. In principal, these two 
situations are equivalent since the relative motion of the two proteins while unaligned Al 
can be approximately understood as an isotropic random walk, and the criterion for going 
over to translational alignment A2 reflects an absorbing boundary in the configuration space 
of relative positions. 

The distribution of resting times Ton (plotted with the cross-symbol in Fig. [?]) follows the 
same power law as /(Tofj), but the exponential cutoff occurs much earlier. The reason is 
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that here the cutoff is determined by the region in configuration space where the two model 
proteins are in state A2. As this is much smaller than the whole volume of the boundary 
box, in which they are unaligned and therefore in state ^1, a random walk in state A2 will 
end earlier. 

The differences we obtain in the distributions of Ton and Tos when using the variants 
A42 and Ai3 compared to A^l are generally very small and unlikely to account for any 
deviations in the overall encounter rates. Also, the distribution of is always well described 
by a single exponential decay. However, the inverse decay length (A^) significantly varies 
between the different situations. Therefore, changes in the overall encounter rate are mainly 
caused by a different probability for reaching state ^3 from state A2. This is reasonable 
when considering that the interactions are strongly localized and can thus only act while 
the system is in the aligned state A2. 

D. Three bimolecular systems with different physico-chemical interface properties 

So far we have only considered Barnase:Barstar (SI) to demonstrate how our compu- 
tational model works. We now use our setup for a more comprehensive investigation. In 
particular, we also apply our method to two other systems, cytochrome c and its peroxi- 
dase {S2) as well as the p53:MDM2 complex {S3). Those represent systems with different 
interface characteristics and where the role of electrostatics is either much stronger {S2) or 
much weaker {S3) than for SI. To this end, all the previously described quantities were 
measured for 8 different concentrations c = {125, 250, 500, 750, 1250, 2500, 5000, 7500}pM. 
Furthermore, to find out how the choice of the radius of the reactive patch affects the re- 
sults, we used patch radii of r = 6A and r = 3A in addition to the initially considered value 
of r = lOA. 

Tab. HI] lists the encounter rates k as obtained from these simulations. The rates are 
all roughly of the same order of magnitude. Yet several interesting qualitative features are 
readily apparent. First, for decreasing patch sizes, the rates generally decrease. Second, 
this effect is weaker for A42 compared to A/11, which basically means that the electrostatic 
attraction and orientation due to the dipole interaction are indeed enhancing the encounter. 
The strongest effect of the electrostatic interaction is obtained for Cytc:CCP, which is the 
system with the largest monopole and dipole and the best alignment of the directions of 
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the dipoles and the reactive patches. On the other hand p53:MDM2 is nearly unaffected by 
the effective charges, due to its weak monopole charges and, additionally, an unfavorable 
alignment of the dipolar interaction and the reactive surface area. Furthermore, regarding 
the results with detailed steric structure the effect on the rate is correlated with the 
deviations of the protein forms from the spherical excluded volume approach in A^l and 
A42. This deviation is smallest for Cytc:CCP and largest for p53:MDM2. 

The findings for the encounter rate k are also reflected in the results for ( A^) . As expected, 
an increase in k correlates with a decrease in (N). The only exception is Cytc:CCP observed 
in Ai2, which is also special in regard to the effect of patch size. Here, the effective Coulombic 
interaction is strongest and the dipole moment is best aligned with the reactive patches. 
Therefore, having reached state A2 once, the proteins do systematically orient towards ^3, 
while they are additionally strongly steered back towards A2 when loosing their translational 
alignment. This behavior is the stronger the closer the model proteins have approached once 
- i.e. for the case of small patch sizes, where state A2 implies the smallest distance. While 
this only explains the inversion in the (A^) behavior as a function of patch size r, k is 
obviously still slightly decreasing with smaller patch sizes. This can be explained by the 
fact that the time to the first approach of state A2 is larger for smaller patches, as this 
implies a smaller relative distance. This obviously compensates the fact that afterwards the 
encounter is formed even quicker, as reflected by the decreasing (A^). 

The strong correlation between the encounter rate k and the mean number of contacts 
(A^) is also evident from the correlation plot in Fig. [s} Indeed, k ~ (A^)~^ seems valid for 
most of the different systems and models. It is noteworthy that the prefactor is very similar 
in all cases. Basically, this means that one unsuccessful contact takes the same amount 
of time on average, no matter what the local details of the system are. This gets more 
obvious recalling the distributions of the resting and return times Ton and T^s in Fig. [7| 
which shows that (Tog) > (Ton). As the average time for one contact will be approximately 
(Ton) + (Toflf), it is dominated by Tqs, which is only marginally influenced by the local details 
of the system and the chosen model. Therefore it can be concluded, that for SI and S3 
the incorporation of a more detailed modeling approach influences k and (A^), but not the 
overall characteristics of the encounter process. 

The only exceptions for the clear correlation of k and (A^) are A42 and Ai3 for the case 
Cytc:CCP (52), where k is nearly independent of (A^) because of the strong electrostatic 
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interaction. This is consistent with the earher finding, that the behavior of Cytc:CCP is 
quahtatively different [12], as its electrostatic interactions would facilitate long-lived non- 
specific encounters between the proteins that allowed the severe orientational criteria for 
reaction to be overcome by rotational diffusion. For all three systems studied, in A43 the 
smallest patch size r = SA leads to a somewhat artificial slowing down, because in this case 
an overlap of the patches is rather hindered by the beads modeling the protein structure. 



E. Size of the reaction patches 

We next address the dependence of the data on the size of the reaction patches 
in more detail. This behavior is exemplary studied with the Barnase:Barstar model 
system. In Fig. |9| the encounter frequency has been obtained from simulations for 
Barnase:Barstar-like model particles in the framework of at several concentrations 
Co = {5/iM, 125nM, 2.5nM, 125pM} and varying patch sizes r. All values in the figure have 
been scaled with the concentration, which leads to data collapse. It is obvious that as r gets 
larger than 2R at around r = 40A, the reactive patch covers the whole model particle and we 
therefore cross over to the Smoluchowski limit of isotropic reactivity, where ~ r. However, 
at high densities and large r, the patches span a large part of the simulation box of edge 
length L, and do immediately encounter for a threshold value of r = rmax = where 
the sum of the patch diameters 4r equals the triagonal. Thus, the encounter frequency must 
diverge with ~ l/{rmax — f')"', where we suppose a = 3, as the volume of configurational 
space without immediate encounter is decreasing with r^. This assumption in addition with 
the Smoluchowski behavior would lead to A; ~ r/{rmax — for large r, which follows the 
data in Fig. |9]well (black dashed lines). 

As already mentioned it is well known that the electrostatic interaction of proteins can 
severely increase the association rate. However, under physiological salt conditions, Coulom- 
bic interactions are screened by counter ions in the solution on a small length scale of ap- 
proximately 1 nm. Thus, deviations from case A41 without effective charges will only arise 



for small r. Fig. 10 shows the results of respective simulations for A42 compared to the 
results for 7V11, as considered before. Indeed, for large patch radii r, the results are similar, 
while for smaller r, the encounter rates in Ai2 are clearly higher compared to Ail. However, 
the crossover to a power law behavior with roughly ~ r^/^ can be detected for very small r. 
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but at a prefactor of about 50 times larger than for Ail. 
IV. DISCUSSION 

The main goal of this work was to model protein encounter in a generic framework which 
allows us to include molecular details without making future upscaling to larger complexes 
impossible. Our model approach incorporates steric, electrostatic and thermal interactions 
of the proteins considered. These interactions are thought to be the major factors governing 
protein encounter. Not included are conformational changes of the proteins upon association, 
related entropic terms, and the molecular nature of the surrounding solvent that becomes 
relevant at close distances. The model parameters are extracted from the atomic structures 
available in the protein data bank by generally applicable protocols as described in Sect. |Tl} 
In principle, these methods of data extraction can be fully automatized. 

The biggest advantage of our coarse-grained model is the possibility to extend the sim- 
ulations to large scales in terms of particle numbers, time and system size. In many of the 
earlier studies [39l HHl HHl HI] , the system was prepared already close to encounter and the 
overall association rate was then calculated via a sophisticated path-integral like procedure. 
In contrast, our simulations account for the whole process of diffusional encounter and is 
thus rather general, allowing for spanning large time scales via our adaptive time step al- 
gorithm. In particular, each set of simulations consists of 10^ to 10^ runs of lengths up to 
the order of seconds and could be performed on a standard CPU within hours of computer 
time. 

Being able to directly obtain the first passage times (FPT) of the encounter processes 
in our model allows to check the validity of several phenomenological assumptions. First of 
all, the FPT distribution matched very well a Poisson process with a single stochastic rate, 
as seen in Fig. |2| which validates the notions of encounter and association. Moreover, our 
approach provides two ways of controlling the particle density and for both cases the results 
corresponded well to the expected scaling. First, the concentration is inversely correlated 
with the size of the periodic boundary simulation box. We show that the encounter rate 
grows linearly with the particle concentration. Furthermore, leaving the box size constant, 
the concentration can also be varied by adding a higher number N of particles. Considering 
only the first encounter of any of the possible complementary pairs of model particles, the 
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mean first passage time to this event is not only lowered by a factor but we show that 
the expected behavior is an enhancement of the encounter frequency by A^^, which is nicely 
matched by the results of the simulations. Therefore we can conclude that the computational 
model studied here satisfies the general requirement of stochastic bimolecular association 
processes that describe binding by a single rate constant. 

To test our model against known results we have chosen three well-known bimolecular 
systems with different characteristics. The Barnase: Bar star complex is the gold standard for 
protein-protein association and characterized by relatively strong electrostatic steering. The 
association of Cytochrome c and its peroxidase is even more strongly affected by Coulombic 
attraction. Here, both proteins have a rather spherical form. Finally, the p53:MDM2 com- 
plex has a different characteristic with a very small net charge and a deep cleft perfectly 
matching the small peptide p53, whose reactive area is therefore nearly spanning over its 
whole surface. These model systems were purposely chosen to check whether our effective 
representations of the protein properties would lead to reasonable and significantly distin- 
guishable results. Indeed, this is the the discussion of the results in Tab. [IT] in the 
respective section shows. 

When comparing the results for the encounter rates in Tab. [TT] with previous studies 
from the field of bimolecular protein association, several aspects have to be kept in mind. 
First, throughout this study, we do only consider the encounter of our model particles. As 
explained in the beginning, the complete association of the complex still lacks the step over 
a final free energy barrier, which is due to effects such as the dehydration of the protein 
surfaces and thus requires more detailed modelling. In the framework of our approach, this 
final step could be modelled by a stochastic rate criterion, where the rate can be obtained by 
transition state theory from the energy landscapes characterized in atomistic calculations. 
In any case, any additional process to be included can only lower the values found in our 
study. 

In the work on Barnase:Barstar by Schreiber et al. [5], the authors reported that the 
association between Barnase and Barstar is a diffusion-limited reaction. The argument for 
this is that the association rates at high ionic concentrations in the solution, i.e. for the 
limit in which the electrostatic steering gets negligible, are clearly lowered by the addition 
of glycerol, which will lead to slower diffusion. Assuming diffusion control, the reactive 
step over the final barrier should be kinetically unimportant, as generally discussed in Ref. 
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[M]. Indeed, we see that our results for the encounter rates lead to values in the correct 
order of magnitude of ~ lO^M^^s^^, which is similar to the experimental value obtained 
by Schreiber et al. for the association constant of Barnase:Barstar k = 8 ■ lO^M^^s^^ at 
physiological salt |19]. However, the basal association rate, i.e. the rate at high ionic strength, 
is reported as < 10^M~^s~^ from experiments [5]. Given that the association process of 
Brn:Brs is diffusion limited, these findings should actually coincide with our values for 
Ail. But as we already discussed in the results section, in our simulations the influence of 
the effective electrostatics introduced in 2 do not result in such a drastic change of the 
encounter behavior. 

In several earlier approaches, similar problems have been addressed by computational 
and analytical studies. In work by Zhou and coworkers, basal encounter rates for particles 
with reactive patches have been found to be = 4 ■ 10^M~-'^s~^ and k = 10''M~^s~"'^ [20] . 
that is closer to the basal rates reported by Schreiber and coworkers. It has to be noted that, 
in both cases, the patches were fiat areas above the surface of the spherical model particles, 
which had a smaller angular extension compared to our cases, and especially required a 
much closer translational approach (0.7A in [15]) to form the encounter. If we expand the 
graph in Fig. [o] to smaller patch radii like r = lA, we also find basal rates in the order of 
k = 10'^M~^s~^. Also, the deviation between Ml and Ai2, i.e. the influence of the effective 
electrostatics, is more prominent and could enhance the encounter rate by about two orders 
of magnitude, which is consistent with the findings in the previously cited work. There, 
the effect of Coulombic interaction is reflected with a Boltzmann factor due to a pairwise 
Coulomb energy. This approach works well, as shown in Ref. [17], and has been recently 
used in a more complex model study of the energy landscape of protein-protein association 
[501 EI]. 

Any model for the reaction patches has to rely on results obtained from more detailed 
modeling. The surface of a protein is typically densely covered by water molecules due to 
the hydrophilic nature of its surface. This hydration shell has a thickness of about 3A and 
will therefore in principal hinder the approach of two proteins to distances below 6A. Setting 
the encounter patches to values below this threshold of 3A would then mean that part of the 
dehydration would already have happened before the encounter is actually formed, which 
is probably hardly described by simple diffusion with drift. Moreover, all of the considered 
protein systems feature distinct key-lock binding interfaces regarding the steric structure. 
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apart from some flexibility due to intrinsic thermal motion. Therefore, it makes sense to 
represent the encounter area by a three-dimensional extended object rather than by a flat 
surface region. Indeed, the results of our studies show that our approach is capable of 
reproducing encounter rates in a reasonable order of magnitude, qualitatively reproducing 
generally expected features. In an in-depth investigation of the dependency of the encounter 
rate and the patch radius it is shown, that the choice of the geometry of the reactive area is 
at least as crucial for the results as definition of the model interactions and its parameters. 
In principal, one could think of the patch radius as a valuable tuning parameter to fit 
experimental results and the encounter kinetics in the computational model. 

Our approach makes it possible to observe general features of the encounter process. In 
particular, we dissect the pathway to the encounter complex in several levels of alignment 
between our model proteins. As we observe the full trajectory to encounter in our simula- 
tions, we are able to extract the number of unsuccessful contacts N between the proteins 
until they finally reach a reasonably aligned state to bind. The distribution of N is again 
in all cases well described by a single exponential decay. This behavior is not obvious as 
the probability of success for one contact is depending on several aspects of diffusion in a 
complex manner. First, the closer the rotational alignment at the beginning of the con- 
tact is to the encounter state, the higher is the probability of success. Second, this initial 
ahgnment is also coupled to the last contact if the time in between Toff is small. Finally, 
longer contact resting times Ton also increase the probability of encounter. It is interesting, 
that all these effects still lead to a simple Poisson distribution of the number of contacts N 
when averaging over the initial conditions as it is done in this work. Furthermore, we find 
that the distributions of these resting and return times cannot be described by a Poisson 
process, but are consistent with the expectations for a spatially constricted random walk 
in three dimensions. We find that the particular mean FPT to encounter is in most of the 
cases directly proportional to the number of unsuccessful contacts. This seems to be a very 
fundamental qualitative feature irrespective of the details of the proteins and the applied 
model. However, for Cytc:CCP the behavior is quahtatively different, which is consistent 
with earlier studies of this highly electrostatically steered complex. 

In summary, here we have presented a Langevin equation approach to protein-protein 
association which in principle allows us to combine long simulation times and large systems 
with molecular details of the involved proteins. This first study has focused on bimolecular 
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reactions and has proven that this approach is capable of reproducing known association 
rates with a reasonable dependance on the main parameters involved. One special strength of 
our approach is that it allows us to address the details of the binding pathway, for example 
by measuring the statistics of unsuccessful contacts before encounter. In the future, this 
approach will be extended to large protein complexes of special biological interest. 
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TABLE I: Protein structures and parameters used in the study. The coordinates of the patches 
(fpatch) and the dipole moment (p) are given relative to the center of mass. 



Protein System PDB code Ref. i?gyr/A q/e ^ patch / ^ p/A e 

Barnase 51 Ibrs [231 [52] 14.68 2 (5.43,-4.75,-3.41) (3.84,-0.67,-36.15) 

Barstar 13.42 -6 (-6.05,3.75,6.34) (54.75,-14.39,0.621) 

Cytc 52 2PCC [31] 13.89 6 (-2.08,7.99,-3.83) (-7.03,117.79,-10.99) 

CCP 20.00 -13 (8.89,-8.19,11.46) (-23.47,93.85,-171.82) 

p53 53 lYCR [SH] 10.20 -2 (0.28,0.21,0.59) (21.64,-17.73,-17.12) 

MDM2 16.81 1 (1.577,-4.74,-0.51) (75.07,14.51,4.67) 
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TABLE II: Encounter rates k which have been averaged over several simulations at different 
concentrations as given in the text. The values are given in k / 10^M~^s~^ for the three different 
versions of our model. (A'") are average values for the number of unsuccessful contacts before 
encounter. (N) is basically independent of the concentration. Therefore it is again averaged 
over the different simulations for each of the chosen systems. The errors were determined by one 
standard deviation from the 8 values obtained at different concentrations. Some of the choices 
for the patch radius were not applicable to M3, as for these cases an encounter was completely 
prevented by the detailed excluded volume model. 



System Patch radius 


k{Ml) k{M2) k{M3) {N)iMl) {N){M2) {N){M3) 


Brn:Brs 10.0 
6.0 
3.0 


1.56 ± 0.04 2.76 ± 0.07 2.02 ± 0.02 
0.57 ± 0.01 2.13 ± 0.01 1.34 ± 0.08 
0.13 ± 0.001 1.28 ± 0.03 


474 ±2 198 ± 5 282 ± 10 

1140 ± 4 232 ± 8 534 ± 50 
4120 ± 10 653 ±15 


Cytc:CCP 10.0 
6.0 
3.0 


1.12 ± 0.02 4.31 ± 0.20 4.15 ± 0.15 
0.40 ± 0.01 4.29 ± 0.21 4.05 ± 0.09 
0.09 ± 0.001 4.03 ± 0.05 0.21 ± 0.02 


842 ± 3 61 ± 5 71 ± 7 
2040 ± 20 30 ± 3 77 ± 10 
7540 ± 63 21 ± 3 4160 ± 375 


p53:MDM2 10.0 
6.0 
3.0 


2.05 ± 0.05 2.51 ± 0.04 1.27 ± 0.02 
0.80 ± 0.01 1.12 ± 0.01 0.15 lb 0.01 
0.19 ± 0.002 0.28 ± 0.01 


362 ± 2 266 ± 4 823 ± 10 
815 ± 10 582 ± 13 8720 ± 200 
2900 ± 35 2550 ±30 
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FIG. 1: Scheme to visualize the different variants of the model for the three considered model sys- 
tems. The color code is: yellow for Barnase, cytochrome c and p53; green for Barstar, cytochrome 
c peroxidase and MDM2. The respective reaction patches are shown in white. Ml only includes 
a simple steric interaction. M.2 has an additional effective electrostatic interaction, here denoted 
with red arrows showing the direction of the dipole of the model particles. In A^3, the excluded 
volume is modeled in more detail as a collection of smaller beads. The transparent blue spherical 
surface marks the volume used in A^l and A42 for the sake of comparison. Finally, the bottom 
row shows surface representations of the atomistic structures taken from the protein database. 

FIG. 2: Logarithmic plot of the distribution of the first passage time to encounter T between 
a single pair of Barnase and Barstar model particles in a cubic boundary box of edge length 
L = 2370A, representing a concentration of 0.125/xM for each protein. The dashed line represents 
a single exponential fit to the data points, which shows the expected behavior with respect to the 
encounter frequency k = {T)~^. 

FIG. 3: Simulated encounter frequencies for a single pair of Barnase and Barstar model particles 
in cubic boundary boxes of different sizes representing different concentrations. The dashed line is 
a linear fit to the data. 

FIG. 4: Encounter frequency for different number of Barnase:Barstar pairs leaving the size of 
the boundary box constant. The red data points show the encounter frequencies as obtained from 
simulations, while the green line represents the function CN"^, where N is the number of particle 
pairs and C is a fitted prefactor. 
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FIG. 5: Different alignment states during tlie encounter process. Al proteins are completely 
unaligned. In state A2, referred to as contact in this paper, the proteins are translationally aligned, 
i.e. they are close enough to actually encounter (denoted by the overlap of the lightened area around 
the model particles), but lack the correct orientation. ^3 proteins reached the encounter meaning 
that the reactive patches are in translational and rotational alignment. 



FIG. 6: Logarithmically plotted distribution of the number of approaches N between a Barnase 
and a Barstar particle with incorrect rotational alignment. The dashed line is an exponential fit 
to the data. 



FIG. 7: Double-logarithmic plot of the distribution of resting and return times of the translationally 
aligned state {A2 in Fig. |5]). 



FIG. 8: Correlation plot of encounter rate k and mean number of contacts (A^) with all the data 
from Tab. HIl 



FIG. 9: Encounter rates in dependency of the patch size for the Barnase:Barstar model system in 
the A41 variant. 



FIG. 10: Comparison oi Ail and A42 similar to Fig. [9] for small patch sizes. For larger patch sizes 
there is no substantial difference. 
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